
    Acoustic Lexemes for Organizing Internet Audio

    In this article, a method is proposed for automatic fine-scale audio description that draws inspiration from ontological sound description methods such as Schaeffer's Objets Sonores and Smalley's Spectromorphology. The goal is complete automation of audio description at the level of sound objects for indexing and retrieving sound segments within Internet audio documents. To automatically segment audio documents into acoustic lexemes, a hidden Markov model is employed. It is demonstrated that the symbol stream of cluster labels, generated by the Viterbi algorithm, constitutes a detailed description of audio as a sequence of spectral archetypes. An ASCII base-64 encoding scheme maps cluster indices to one-character symbols, which are segmented into 8-gram sequences for indexing in a relational database. To illustrate the methods, the essential components of an audio search engine are described: the automatic cataloguer, the retrieval engine and the query language. The results of experiments that test the accuracy and the retrieval efficiency of six new similarity-matching algorithms for audio using acoustic lexemes are presented. The article concludes with examples of audio matching using the structured query language (SQL) for creating new musical sequences from large extant audio collections.
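    The indexing step described above lends itself to a compact illustration. The sketch below (Python) shows how a Viterbi state sequence of cluster labels might be mapped to one-character base-64 symbols and segmented into overlapping 8-grams ready for a relational index; the alphabet, function names and overlapping-window policy are assumptions made for illustration, not the article's exact implementation.

        import string

        # A 64-symbol alphabet; the article's exact base-64 mapping may differ.
        BASE64_ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
                           + string.digits + "+/")

        def labels_to_symbols(cluster_labels):
            """Map a Viterbi state sequence (integers 0-63) to a symbol string."""
            return "".join(BASE64_ALPHABET[c] for c in cluster_labels)

        def symbol_ngrams(symbols, n=8):
            """Segment the symbol stream into overlapping n-grams for indexing."""
            return [symbols[i:i + n] for i in range(len(symbols) - n + 1)]

        if __name__ == "__main__":
            # Hypothetical Viterbi output for one audio document.
            viterbi_states = [3, 3, 17, 17, 17, 42, 8, 8, 8, 8, 60, 5]
            lexeme_string = labels_to_symbols(viterbi_states)
            print(lexeme_string)                  # "DDRRRqIIII8F"
            print(symbol_ngrams(lexeme_string))   # 8-grams ready to store and query

    Each 8-gram, together with a document identifier and offset, could then be stored as a table row and matched with ordinary SQL equality queries, which is one plausible reading of the relational indexing the article describes.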

    Tonal representations for music retrieval: from version identification to query-by-humming

    In this study we compare the use of different music representations for retrieving alternative performances of the same musical piece, a task commonly referred to as version identification. Given the audio signal of a song, we compute descriptors representing its melody, bass line and harmonic progression using state-of-the-art algorithms. These descriptors are then employed to retrieve different versions of the same musical piece using a dynamic programming algorithm based on nonlinear time series analysis. First, we evaluate the accuracy obtained using individual descriptors, and then we examine whether performance can be improved by combining these music representations (i.e. descriptor fusion). Our results show that whilst harmony is the most reliable music representation for version identification, the melody and bass line representations also carry useful information for this task. Furthermore, we show that by combining these tonal representations we can increase version detection accuracy. Finally, we demonstrate how the proposed version identification method can be adapted for the task of query-by-humming. We propose a melody-based retrieval approach, and demonstrate how melody representations extracted from recordings of a cappella singing can be successfully used to retrieve the original song from a collection of polyphonic audio. The current limitations of the proposed approach are discussed in the context of version identification and query-by-humming, and possible solutions and future research directions are proposed.
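    For readers unfamiliar with this family of methods, the sketch below (Python/NumPy) illustrates one generic way a dynamic-programming score between two tonal descriptor sequences (e.g. beat-synchronous chroma vectors) can be computed: a binary cross-similarity matrix followed by a Smith-Waterman-style local alignment. This is only a hedged stand-in for the alignment idea, not the authors' nonlinear time-series algorithm; all names and thresholds are illustrative.

        import numpy as np

        def cross_similarity(seq_a, seq_b, threshold=0.75):
            """Binary matrix: 1 where frame-wise cosine similarity is high."""
            a = seq_a / (np.linalg.norm(seq_a, axis=1, keepdims=True) + 1e-9)
            b = seq_b / (np.linalg.norm(seq_b, axis=1, keepdims=True) + 1e-9)
            return (a @ b.T >= threshold).astype(float)

        def alignment_score(sim, gap_penalty=0.5):
            """Local alignment: long matching stretches yield a high score."""
            n, m = sim.shape
            d = np.zeros((n + 1, m + 1))
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    step = d[i - 1, j - 1] + (1.0 if sim[i - 1, j - 1] else -gap_penalty)
                    d[i, j] = max(0.0, step,
                                  d[i - 1, j] - gap_penalty,
                                  d[i, j - 1] - gap_penalty)
            return d.max()

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            query = rng.random((80, 12))                    # hypothetical chroma sequence
            candidate = np.vstack([query[10:60] + 0.05 * rng.random((50, 12)),
                                   rng.random((30, 12))])   # shares a noisy segment
            print(alignment_score(cross_similarity(query, candidate)))

    Descriptor fusion could then be as simple as combining such scores computed separately from the melody, bass-line and harmony sequences, for example by rank aggregation or a weighted sum.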

    Twelve-month-old infants’ physiological responses to music are affected by others’ positive and negative reactions

    Infants show remarkable skills for processing music in the first year of life. Such skills are believed to foster social and communicative development, yet little is known about how infants' own preferences for music develop and whether social information plays a role. Here, we investigate whether the reactions of another person influence infants' responses to music. Specifically, 12-month-olds (N = 33) saw an actor react positively or negatively after listening to clips of instrumental music. Arousal (measured via pupil dilation) and attention (measured via looking time) were assessed when infants later heard the clips without the actor visible. Results showed greater pupil dilation when listening to music clips that had previously been reacted to negatively than those that had been reacted to positively (Exp. 1). This effect was not replicated when a similar, rather than identical, clip from the piece of music was used in the test phase (Exp. 2, N = 35 12-month-olds). There were no effects of the actor's positive or negative reaction on looking time. Together, our findings suggest that infants are sensitive to others' positive and negative reactions not only for concrete objects, such as food or toys, but also for more abstract stimuli including music.